Resource Usage Estimation of Data Stream Processing Workloads in Datacenter Clouds

Authors

  • Alireza Khoshkbarforoushha
  • Rajiv Ranjan
  • Raj Gaire
  • Prem Prakash Jayaraman
  • John G. Hosking
  • Ehsan Abbasnejad
Abstract

Real-time computation of data streams over affordable virtualized infrastructure resources is an important form of data-in-motion processing architecture. However, processing such data streams while ensuring strict guarantees on quality of service is problematic due to: (i) uncertain stream arrival patterns; (ii) the need to process different types of continuous queries; and (iii) the variable resource consumption behavior of continuous queries. Recent work has explored the use of statistical techniques for resource estimation of SQL queries and OLTP workloads. All of these techniques approximate resource usage for each query as a single point value. However, in data stream processing workloads, in which data flows endlessly through a graph of operators and causes fluctuations in performance and resource demand, single-point resource estimation is inadequate because it is neither expressive enough nor able to capture the multi-modal nature of the target data. To this end, we present a novel technique that uses mixture density networks, a combined structure of neural networks and mixture models, to estimate the whole spectrum of resource usage as probability density functions. The proposed approach is a flexible and convenient means of modeling unknown distributions. We have validated the models using both the Linear Road benchmark and TPC-H, observing high accuracy under a number of error metrics: mean-square error, continuous ranked probability score, and negative log predictive density.
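
To make the idea of estimating resource usage as a probability density concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-dimensional Gaussian mixture whose parameters come from the raw output vector of some network; the function names, the layout of that output vector, and the example numbers are all hypothetical. It shows how such an output can be turned into a density and scored with negative log predictive density, one of the metrics named in the abstract.

```python
import math


def softmax(logits):
    """Normalize raw scores into mixing weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mdn_params(raw_output, n_components):
    """Split a raw output vector into mixture parameters.

    Assumed (hypothetical) layout: [weight logits | means | log std devs].
    """
    k = n_components
    weights = softmax(raw_output[:k])
    means = list(raw_output[k:2 * k])
    stds = [math.exp(v) for v in raw_output[2 * k:3 * k]]  # keeps std > 0
    return weights, means, stds


def mixture_pdf(y, weights, means, stds):
    """Density of a 1-D Gaussian mixture evaluated at y."""
    total = 0.0
    for w, mu, sigma in zip(weights, means, stds):
        z = (y - mu) / sigma
        total += w * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return total


def nlpd(observations, weights, means, stds):
    """Average negative log predictive density over observed values."""
    logs = [math.log(mixture_pdf(y, weights, means, stds)) for y in observations]
    return -sum(logs) / len(logs)


# Made-up example: a two-mode prediction for CPU usage (percent),
# with modes near 20 and 80. These numbers are illustrative only.
raw = [0.0, 0.0, 20.0, 80.0, math.log(5.0), math.log(5.0)]
weights, means, stds = mdn_params(raw, n_components=2)
print(mixture_pdf(20.0, weights, means, stds))
print(nlpd([18.0, 22.0, 79.0], weights, means, stds))
```

A single point estimate for this example would have to pick one value, such as the mean of 50, where the predicted density is close to zero; the mixture keeps both modes.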

Similar resources

Using WSRM to Track SQL Server’s Resource Usage

Microsoft’s Windows System Resource Manager (WSRM) is a workload management tool included with Windows Server 2003 Enterprise or Datacenter. Administrators can use WSRM to control how CPU and memory are shared among competing processes. WSRM is typically used to normalize consolidated workloads (reducing the risk that a misbehaving application will interfere with others on the system), or to en...

Moonwalk: NRE Optimization in ASIC Clouds or, accelerators will use old silicon

Cloud services are becoming increasingly globalized and data-center workloads are expanding exponentially. GPU and FPGA-based clouds have illustrated improvements in power and performance by accelerating compute-intensive workloads. ASIC-based clouds are a promising way to optimize the Total Cost of Ownership (TCO) of a given datacenter computation (e.g. YouTube transcoding) by reducing both en...

SLA-based virtual machine management for heterogeneous workloads in a cloud datacenter

Efficient provisioning of resources is a challenging problem in cloud computing environments due to its dynamic nature and the need for supporting heterogeneous applications. Even though VM (Virtual Machine) technology allows several workloads to run concurrently and to use a shared infrastructure, it still does not guarantee application performance. Thus, currently cloud datacenter providers e...

SLA-Based Resource Provisioning for Heterogeneous Workloads in a Virtualized Cloud Datacenter

Efficient provisioning of resources is a challenging problem in cloud computing environments due to its dynamic nature and the need for supporting heterogeneous applications with different performance requirements. Currently, cloud datacenter providers either do not offer any performance guarantee or prefer static VM allocation over dynamic, which leads to inefficient utilization of resources. E...

A Performance Study of Big Data on Small Nodes

The continuous increase in volume, variety and velocity of Big Data exposes datacenter resource scaling to an energy utilization problem. Traditionally, datacenters employ x86-64 (big) server nodes with power usage of tens to hundreds of Watts. But lately, low-power (small) systems originally developed for mobile devices have seen significant improvements in performance. These improvements could...

Journal:
  • CoRR

Volume: abs/1501.07020

Publication date: 2015